COS 511 : Theoretical Machine Learning

Author

  • Moritz Hardt
Abstract

Suppose we are given examples x1, x2, ..., xm drawn from a probability distribution D over some discrete space X. Our goal is to estimate D by finding a model that fits the data but is not too complex. As a first step, we need to be able to measure the quality of our model; this is where we introduce the notion of maximum likelihood. To motivate this notion, suppose D is distributed according to one of two possible density functions q1 and q2. Intuitively, if we observe that q1(xi) is typically much larger than q2(xi), we will tend to conclude that D is distributed according to q1.
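This comparison can be made concrete with a small sketch: pick whichever model assigns the data the larger (log-)likelihood. The distributions q1, q2 and the sample below are made up for illustration.

```python
import math

# Two hypothetical densities over the discrete space X = {0, 1, 2}.
q1 = {0: 0.7, 1: 0.2, 2: 0.1}
q2 = {0: 0.1, 1: 0.3, 2: 0.6}

# An illustrative sample; most mass near 0, where q1 is large.
samples = [0, 0, 1, 0, 2, 0, 1, 0]

def log_likelihood(q, xs):
    """Sum of log q(x_i); working in logs avoids underflow for large m."""
    return sum(math.log(q[x]) for x in xs)

ll1 = log_likelihood(q1, samples)
ll2 = log_likelihood(q2, samples)

# Maximum likelihood: prefer the model under which the data is most probable.
best = "q1" if ll1 > ll2 else "q2"
print(best)
```

Since q1(xi) exceeds q2(xi) on most of the observed points, the log-likelihood comparison formalizes the intuition in the abstract.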


Similar resources

Theoretical Machine Learning Cos 511 Lecture #9

In this lecture we consider a fundamental property of learning theory: it is amenable to boosting. Roughly speaking, boosting refers to the process of taking a set of rough “rules of thumb” and combining them into a more accurate predictor. Consider for example the problem of Optical Character Recognition (OCR) in its simplest form: given a set of bitmap images depicting hand-written postal-cod...

Full text
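The "rules of thumb" idea in this abstract can be sketched with a minimal AdaBoost-style loop (an illustrative reconstruction, not the lecture's exact presentation): decision stumps are the weak rules, and a weighted majority combines them. The data set and thresholds below are made up.

```python
import math

# Toy 1-D data: label +1 for small x, -1 for large x.
X = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
y = [1, 1, 1, -1, -1, -1]

def weak_learners():
    # "Rules of thumb": threshold stumps h_t(x) = +1 iff x < t.
    for t in [0.25, 0.5, 0.8]:
        yield lambda x, t=t: 1 if x < t else -1

def adaboost(X, y, rounds=3):
    m = len(X)
    D = [1.0 / m] * m            # distribution over training examples
    ensemble = []                # list of (alpha, h) pairs
    for _ in range(rounds):
        # Choose the stump with the lowest weighted error under D.
        h, err = min(
            ((h, sum(D[i] for i in range(m) if h(X[i]) != y[i]))
             for h in weak_learners()),
            key=lambda pair: pair[1])
        err = max(err, 1e-10)    # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Reweight: misclassified examples gain weight for the next round.
        D = [D[i] * math.exp(-alpha * y[i] * h(X[i])) for i in range(m)]
        Z = sum(D)
        D = [d / Z for d in D]
    # Final predictor: sign of the alpha-weighted vote.
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

H = adaboost(X, y)
print([H(x) for x in X])
```

The combined predictor H is typically more accurate than any single stump, which is the phenomenon the abstract calls boosting.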

COS 511 : Theoretical Machine Learning

In other words, if ε ≤ 1/8 and δ ≤ 1/8, then PAC learning is not possible with fewer than d/2 examples. The outline of the proof is as follows: to prove that there exist a concept c ∈ C and a distribution D, we construct a fixed distribution D but do not fix the exact target concept c; instead, we choose c at random. If we get an expected probability of error over c, then there ...

Full text
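The claim this excerpt sketches can be written out as a theorem statement (a hedged reconstruction from the truncated text; here d is assumed to denote the VC-dimension of the concept class C, and the constants follow the excerpt's ε, δ ≤ 1/8 setting):

```latex
% Reconstructed statement; d is assumed to be the VC-dimension of C.
\begin{theorem}
Let $C$ be a concept class of VC-dimension $d$. If $\epsilon \le 1/8$ and
$\delta \le 1/8$, then there exist a concept $c \in C$ and a distribution
$D$ such that any learning algorithm given $m < d/2$ examples outputs,
with probability greater than $\delta$, a hypothesis with error
greater than $\epsilon$.
\end{theorem}
```

The proof strategy in the excerpt is the probabilistic method: fix D, draw the target c at random, and lower-bound the expected error over the choice of c; some fixed c must then achieve that error.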

COS 511 : Theoretical Machine Learning

as the price relative, which is how much a stock goes up or down in a single day. St denotes the amount of wealth we have at the start of day t, and we assume S1 = 1. We denote by wt(i) the fraction of our wealth in stock i at the beginning of day t, which can be viewed as a probability distribution since ∀i, wt(i) ≥ 0 and ∑_i wt(i) = 1. We can then derive the total wealth in stock i a...

Full text
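The wealth recurrence implied by these definitions, S_{t+1} = S_t · ∑_i wt(i) · xt(i), can be sketched directly (the two-stock price data below is made up; the recurrence itself follows the excerpt's definitions):

```python
def wealth(price_relatives, portfolios):
    """Compound wealth over days.

    price_relatives[t][i]: factor by which stock i moves on day t.
    portfolios[t][i]: fraction of wealth held in stock i on day t;
    each row must be a probability distribution (nonnegative, sums to 1).
    """
    S = 1.0                                   # S_1 = 1, as in the excerpt
    for x_t, w_t in zip(price_relatives, portfolios):
        assert all(w >= 0 for w in w_t) and abs(sum(w_t) - 1.0) < 1e-9
        # Each day, wealth is multiplied by the portfolio's price relative.
        S *= sum(w * x for w, x in zip(w_t, x_t))
    return S

# Illustrative: a constantly rebalanced 50/50 portfolio, two stocks, three days.
x = [[1.1, 0.9], [0.8, 1.2], [1.0, 1.05]]
w = [[0.5, 0.5]] * 3
print(wealth(x, w))
```

Rebalancing to fixed fractions each day is the "constantly rebalanced portfolio" benchmark that universal-portfolio analyses compete against.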

COS 511 : Theoretical Machine Learning

Last class, we discussed an analogue of Occam's Razor for infinite hypothesis spaces that, in conjunction with VC-dimension, reduced the problem of finding a good PAC-learning algorithm to the problem of computing the VC-dimension of a given hypothesis space. Recall that VC-dimension is defined using the notion of a shattered set, i.e., a subset S of the domain such that Π_H(S) = 2^|S|. In this le...

Full text
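The shattering condition Π_H(S) = 2^|S| can be checked mechanically for a finite set of hypotheses: enumerate the labelings H induces on S and count them. The threshold class below is an illustrative example, not from the lecture.

```python
def restrictions(H, S):
    """Pi_H(S): the set of distinct labelings hypotheses in H induce on S."""
    return {tuple(h(x) for x in S) for h in H}

def is_shattered(H, S):
    # S is shattered iff H realizes all 2^|S| possible labelings of S.
    return len(restrictions(H, S)) == 2 ** len(S)

# Illustrative class: threshold functions h_t(x) = 1 iff x >= t.
thresholds = [lambda x, t=t: int(x >= t) for t in [-1, 0.5, 1.5, 3]]

print(is_shattered(thresholds, [1]))     # a single point is shattered
print(is_shattered(thresholds, [1, 2]))  # no threshold labels 1 -> 1, 2 -> 0
```

Thresholds shatter any one point but no two points, matching the fact that this class has VC-dimension 1.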


Journal title:

Volume   Issue

Pages  -

Publication date 2008